AITopics | gene tree

gene tree

PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

Xie, Tianyu, Richman, Harry, Gao, Jiansi, Matsen, Frederick A. IV, Zhang, Cheng

arXiv.org Machine Learning, Feb-7-2025

Learning informative representations of phylogenetic tree structures is essential for analyzing evolutionary relationships. Classical distance-based methods have been widely used to project phylogenetic trees into Euclidean space, but they are often sensitive to the choice of distance metric and may lack sufficient resolution. In this paper, we introduce phylogenetic variational autoencoders (PhyloVAEs), an unsupervised learning framework designed for representation learning and generative modeling of tree topologies. Leveraging an efficient encoding mechanism inspired by autoregressive tree topology generation, we develop a deep latent-variable generative model that facilitates fast, parallelized topology generation. PhyloVAE combines this generative model with a collaborative inference model based on learnable topological features, allowing for high-resolution representations of phylogenetic tree samples. Extensive experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies.

Phylogenetic trees are the foundational structure for describing the evolutionary processes among individuals or groups of biological entities. Reconstructing these trees based on collected biological sequences (e.g., DNA, RNA, protein) from observed species, also known as phylogenetic inference (Felsenstein, 2004), is an essential discipline of computational biology (Fitch, 1971; Felsenstein, 1981; Yang & Rannala, 1997; Ronquist et al., 2012). Large collections of trees obtained from these approaches (e.g., posterior samples from MCMC runs (Ronquist et al., 2012)), however, are often difficult to summarize or visualize due to the discrete and non-Euclidean nature of the tree topology space. The classical approach to visualizing and analyzing distributions of phylogenetic trees is to calculate pairwise distances between the trees and project them into a plane using multidimensional scaling (MDS) (Amenta & Klingner, 2002; Hillis et al., 2005; Jombart et al., 2017). However, these approaches have the shortcoming that one cannot map an arbitrary point in the visualization back to a tree, and therefore they do not form an actual visualization of the relevant tree space.
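The classical distance-plus-MDS pipeline mentioned in the abstract can be illustrated with a small sketch. This is not code from the paper: it assumes the Robinson-Foulds distance as the tree metric (one common choice among several), represents each unrooted topology as a set of nontrivial splits written from the side containing taxon "A", and uses classical (Torgerson) MDS. The toy trees and all function names are hypothetical.

```python
import numpy as np

def rf_distance(splits_a, splits_b):
    """Robinson-Foulds distance: number of splits present in exactly one tree."""
    return len(splits_a ^ splits_b)

def classical_mds(dist, dim=2):
    """Classical (Torgerson) MDS: embed a distance matrix into `dim` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by round-off before the square root.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Toy sample (assumption): unrooted 4-taxon topologies, each described by its
# single nontrivial split, written as the side that contains taxon "A".
trees = [
    {frozenset({"A", "B"})},  # AB|CD
    {frozenset({"A", "C"})},  # AC|BD
    {frozenset({"A", "D"})},  # AD|BC
    {frozenset({"A", "B"})},  # AB|CD again, e.g. a repeated posterior sample
]

dist = np.array([[rf_distance(a, b) for b in trees] for a in trees], dtype=float)
coords = classical_mds(dist, dim=2)  # one 2-D point per sampled tree
```

Each row of `coords` is a point in the plane, but only the four sampled trees have coordinates; a point between them does not decode to any topology, which is the limitation the abstract describes.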

phylovae, representation, tree topology, (15 more...)

2502.0473

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Response to Comment on "Ancient origins of allosteric activation in a Ser-Thr kinase"

Science, Nov-19-2020, 18:39:44 GMT

Park et al. question one out of seven findings from Hadzipasic et al.: whether TPX2 allosterically regulates the oldest Aurora. We had already addressed the two concerns raised (sparse sequence sampling and not forcing the gene tree to match the species tree) before publication. Moreover, we believe their ancestral sequence reconstruction would be consistent with a nonallosteric common ancestor, and we show large sequence differences caused by species tree–enforced gene trees. The key findings in Hadzipasic et al. (1) are that (i) autophosphorylation is the ancient allosteric regulation for Aurora kinases; (ii) a gradual increase in allosteric activation took place during the holozoan evolution; (iii) an allosteric network in Aurora exists that, when mutated, alters allosteric activity; (iv) allosteric activation by TPX2 is entirely encoded in the kinase; (v) the interface between Aurora and TPX2 is co-conserved; (vi) evolution of specificity in signaling happens on binding affinity; and (vii) the oldest ancestral Aurora is not allosterically activated by TPX2. Notably, even though the ASR calculations differ, we believe the outcome is consistent with, rather than contradicting, the finding. The two concerns raised are (i) the small number of modern sequences used in the ASR calculations and (ii) the mismatch between the gene tree and the species tree.

gene tree, sequence, species tree, (15 more...)

Country: Asia > Indonesia > Bali (0.05)

Genre: Research Report > New Finding (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Tropical Support Vector Machine and its Applications to Phylogenomics

Tang, Xiaoxian, Wang, Houjie, Yoshida, Ruriko

arXiv.org Machine Learning, Mar-2-2020

Most data in genome-wide phylogenetic analysis (phylogenomics) is essentially multidimensional, posing a major challenge to human comprehension and computational analysis. Also, we cannot directly apply statistical learning models in data science to a set of phylogenetic trees since the space of phylogenetic trees is not Euclidean. In fact, the space of phylogenetic trees is a tropical Grassmannian in terms of max-plus algebra. Therefore, to classify multi-locus data sets for phylogenetic analysis, we propose tropical Support Vector Machines (SVMs) over the space of phylogenetic trees. Like classical SVMs, a tropical SVM is a discriminative classifier defined by the tropical hyperplane which maximizes the minimum tropical distance from data points to itself in order to separate these data points into open sectors. We show that we can formulate hard margin tropical SVMs and soft margin tropical SVMs as linear programming problems. In addition, we show the necessary and sufficient conditions for each data point to be separated and an explicit formula for the optimal solution for the feasible linear programming problem. Based on our theorems, we develop novel methods to compute tropical SVMs and computational experiments show our methods work well. We end this paper with open problems.
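To make the geometric objects in this abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the max-plus convention named in the abstract, under which a tropical hyperplane with normal vector `omega` is the set of points where the maximum of `x + omega` is attained at least twice, and the tropical distance from a point to that hyperplane is the gap between the largest and second-largest entries of `x + omega`. The sample points, `omega`, and the function names are hypothetical.

```python
import numpy as np

def tropical_distance(x, y):
    """Tropical metric between two points: max(x - y) - min(x - y)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff.max() - diff.min())

def distance_to_hyperplane(x, omega):
    """Gap between the two largest entries of x + omega (max-plus convention)."""
    shifted = np.sort(np.asarray(x, dtype=float) + np.asarray(omega, dtype=float))
    return float(shifted[-1] - shifted[-2])

def open_sector(x, omega):
    """Index of the unique maximiser of x + omega, or None if x lies on the hyperplane."""
    shifted = np.asarray(x, dtype=float) + np.asarray(omega, dtype=float)
    top = np.flatnonzero(shifted == shifted.max())
    return int(top[0]) if len(top) == 1 else None

# Hypothetical data: a hyperplane normal and two labelled points.
omega = [0.0, 0.0, 0.0]
p = [3.0, 1.0, 0.0]  # class 1
q = [0.0, 4.0, 1.0]  # class 2

sectors = (open_sector(p, omega), open_sector(q, omega))
# The margin is the smallest distance from any data point to the hyperplane.
margin = min(distance_to_hyperplane(p, omega), distance_to_hyperplane(q, omega))
```

Here the two points fall in different open sectors, so this `omega` separates them; a hard margin tropical SVM would instead search over `omega` for the separating hyperplane with the largest such margin, which the abstract says can be posed as a linear program.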

inequality, linear programming, svm, (14 more...)

2003.00677

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Data Requirement for Phylogenetic Inference from Multiple Loci: A New Distance Method

Dasarathy, Gautam, Nowak, Robert, Roch, Sebastien

arXiv.org Machine Learning, Jun-30-2014

We consider the problem of estimating the evolutionary history of a set of species (phylogeny or species tree) from several genes. It is known that the evolutionary history of individual genes (gene trees) might be topologically distinct from each other and from the underlying species tree, possibly confounding phylogenetic analysis. A further complication in practice is that one has to estimate gene trees from molecular sequences of finite length. We provide the first full data-requirement analysis of a species tree reconstruction method that takes into account estimation errors at the gene level. Under that criterion, we also devise a novel reconstruction algorithm that provably improves over all previous methods in a regime of interest.

data mining, machine learning, species tree, (15 more...)

doi: 10.1109/TCBB.2014.2361685

1404.7055

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
